Thunderbyte signature compiler. (C) Copyright 1993 Thunderbyte B.V.
Table of Contents
1. INTRODUCTION
    1.1. Purpose of TbGenSig
    1.2. General information
2. DEFINING SIGNATURES
    2.1. Format of the UserSig.Dat file
    2.2. Adding a published signature
    2.3. Defining a signature with TbScan
3. ADVANCED FEATURES
    3.1. Keywords
        3.1.1. Item keywords
        3.1.2. Message keywords
        3.1.3. Position keywords
    3.2. Wildcards
        3.2.1. Position wildcards
            3.2.1.1. Skip
            3.2.1.2. Variable
        3.2.2. Opcode wildcards
            3.2.2.1. Low opcode
            3.2.2.2. High opcode
    3.3. Example
1. INTRODUCTION
1.1. Purpose of TbGenSig
TbGenSig is a signature file compiler. Since TBAV is distributed
with an up to date, ready-to-use signature file, you don't really
need the signature file compiler.
However, you need the signature file compiler if you want to define
your own virus signatures. You can either use published signatures
or define your own if you are familiar with the structure of
software.
In both cases, you only need to do this in emergency situations,
such as the unfortunate event that your machine or even your company
is attacked by a yet unknown and thus unrecognized virus. It is
recommended to send a few samples of the virus to some virus
experts anyway, so that scanners can recognize the virus in
their next versions.
It isn't possible to explain the whole subject of virus hunting in
one manual, so this document assumes that you have enough
experience and knowledge to make your own signatures.
1.2. General information
TbGenSig searches for a file named UserSig.Dat in the current
directory. This file should contain the signatures you want to add
to the TBAV signature file TbScan.Sig. TbGenSig checks the contents
of the UserSig.Dat file and applies them to the TbScan.Sig file.
If you want to delete or modify your signatures, just edit or
delete the UserSig.Dat file and run TbGenSig again.
TbGenSig will list all signatures in the TbScan.Sig file on the
screen when running.
2. DEFINING SIGNATURES
2.1. Format of the UserSig.Dat file
You can create and edit the UserSig.Dat file with any DOS text
editor that can output unformatted text.
All lines starting with ';' are comment lines. TbGenSig ignores
these lines.
Lines starting with '%' will be displayed in the upper TbGenSig
window.
In the first line the name of a virus is expected. The second line
contains one or more keywords. The third line contains the
signature itself. This combination of three lines is named a
signature record.
A signature record should look like this:
Test virus
exe com inf
abcd21436587abcd
It is allowed to use spaces in the signature for your own
convenience. TbGenSig will ignore those spaces.
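The record layout described above can be sketched in a few lines of Python. This is only an illustration of the three-line format, not TbGenSig's actual parser: the names SignatureRecord and parse_usersig are hypothetical, and skipping blank lines and leading whitespace is an assumption the manual does not confirm. Lines starting with '%' are simply dropped here rather than displayed.

```python
from typing import List, NamedTuple


class SignatureRecord(NamedTuple):
    name: str            # first line of the record
    keywords: List[str]  # second line, split into upper-case keywords
    signature: str       # third line, with spaces removed


def parse_usersig(text: str) -> List[SignatureRecord]:
    """Group the payload lines of a UserSig.Dat-style text into records."""
    payload = []
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: blank lines are skipped (the manual is silent on this).
        # ';' marks a comment line, '%' a line meant for display only.
        if not stripped or stripped[0] in ";%":
            continue
        payload.append(stripped)
    if len(payload) % 3 != 0:
        raise ValueError("incomplete signature record")
    records = []
    for i in range(0, len(payload), 3):
        name, keyword_line, sig = payload[i:i + 3]
        keywords = [k.upper() for k in keyword_line.replace(",", " ").split()]
        # Spaces inside the signature are allowed and ignored.
        records.append(SignatureRecord(name, keywords, "".join(sig.split()).lower()))
    return records
```

Feeding it the "Test virus" record above yields one record with the keywords EXE, COM and INF and the signature abcd21436587abcd.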
2.2. Adding a published signature
If you want to add a signature that has been published, do the
following.
- Edit or create the UserSig.Dat file. Convert the published
signature to an acceptable format for TbGenSig.
- Use keywords COM EXE BOOT INF
You would get:
New virus
exe com boot inf
1234abcd5678efab
- Execute TbGenSig.
2.3. Defining a signature with TbScan
This chapter is intended for advanced users who own a TBAV.KEY
file or a Thunderbyte add-on card.
Although the TbScan.Sig file is updated frequently, new viruses are
created each day, outpacing the regular upgrading service of this
data file. It is therefore possible that one day your system gets
infected by a recently created virus that has not yet been listed
in the signature file. TbScan will not always detect the virus
in such cases, not even with heuristic analysis. If you are
convinced that your system has been infected even though TbScan
does not confirm this, this chapter gives you a valuable tool
for detecting undocumented viruses. We offer you step-by-step
assistance here in creating an emergency signature that can be
(temporarily) added to your copy of TbScan.Sig.
- Collect some infected files and copy them into a temporary
directory.
- Boot from a clean write-protected diskette. Do NOT execute ANY
program from the infected system, even if you expect this
program to be clean.
- Execute TbScan from your write-protected TbScan diskette with
the 'extract' option set. Make sure that the temporary directory
where you put the infected files will be TbScan's target
directory. With its 'extract' option set, TbScan will NOT scan
the files but, instead, display the first instructions that are
found at the entry-point of the infected programs. Please note
that we highly recommend that you also set the 'session'
option of TbScan to generate a log file.
- Compare the 'signatures' extracted by TbScan. You should see
something like this:
NOVIRUS1.COM    2E67BCDEAB129090909090ABCD123490CD
NOVIRUS2.COM    N/A
VIRUS1.COM      1234ABCD5678EFAB909090ABCD123478FF
VIRUS2.COM      1234ABCD5678EFAB901234ABCD123478FF
VIRUS3.COM      1234ABCD5678EFAB9A5678ABCD123478FF
If the 'signatures' are completely different, either the files
are probably not infected, or they have been infected by a
polymorphic virus that requires an AVR module to detect it.
- There might be some differences in the 'signatures'. You can
use the question mark wildcard ('?') in this case.
A signature to detect the 'virus' in the example above could be:
1234ABCD5678EFAB ?3 ABCD123478FF
The '?3' means that there are three bytes at that position that
should be skipped.
- Add the signature to the data file UserSig.Dat. Give the
virus a name in the first line of its entry. Specify the
keywords COM, EXE, INF and ATE in the second line. Enter the
signature on the third.
You would get:
New virus
exe com ate inf
1234abcd5678efab?3abcd123478ff
- Execute TbGenSig. Make sure the resulting TbScan.Sig file is in
the TbScan directory.
- Run TbScan again in the directory containing the infected
files. TbScan should now detect the virus.
- Send a couple of infected files to a recommended virus expert,
preferably to us.
Congratulations! You have defined a signature all by yourself! Now
you can scan all your machines in search of the new virus.
However, keep in mind that this method of extracting a signature
is a 'quick-and-dirty' solution to viral problems. The extracted
signature might not detect the presence of the virus in all cases. A
signature that is guaranteed to detect all instances of the virus
can be made only after complete disassembly of the new virus. For
these reasons you should NOT distribute your home-made 'signature'
to others. The signature eventually assembled by experienced
anti-virus researchers will be completely different in most cases!
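The comparison step of the procedure above (finding the bytes that the extracted 'signatures' have in common and replacing the differing ones with a skip wildcard) can be sketched as follows. This is a rough illustration only: the function name derive_signature is hypothetical, the input is assumed to be the hex strings TbScan displayed for the infected files, and the sketch assumes that ?n takes a single hexadecimal digit while longer runs use the ?@nn form.

```python
from typing import List


def derive_signature(samples: List[str]) -> str:
    """Build a signature from hex strings, skipping bytes that differ."""
    rows = [bytes.fromhex(s) for s in samples]
    length = min(len(r) for r in rows)
    parts = []
    run = 0  # number of consecutive positions where the samples differ
    for i in range(length):
        if len({r[i] for r in rows}) == 1:
            if run:
                if run > 0x7F:
                    raise ValueError("skip too long for a ?@nn wildcard")
                # Assumption: ?n takes one hex digit, ?@nn two.
                parts.append(f"?{run:X}" if run <= 0xF else f"?@{run:02X}")
                run = 0
            parts.append(f"{rows[0][i]:02X}")
        else:
            run += 1
    # Differing bytes at the very end are dropped rather than skipped.
    return "".join(parts)


infected = [
    "1234ABCD5678EFAB909090ABCD123478FF",
    "1234ABCD5678EFAB901234ABCD123478FF",
    "1234ABCD5678EFAB9A5678ABCD123478FF",
]
print(derive_signature(infected))  # 1234ABCD5678EFAB?3ABCD123478FF
```

The result for the three VIRUSn.COM lines is the same signature as the one worked out by hand above. The caveats of this section still apply: a signature derived this way is a quick-and-dirty one.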
3. ADVANCED FEATURES
3.1. Keywords
Keywords are used for several purposes. They are classified in
categories.
Keywords may be separated by spaces, commas or tabs. The maximum
line length is 80 bytes.
At least one of the following flags should be specified:
BOOT, COM, EXE, HIGH, LOW, SYS or WIN.
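As a small sketch of these rules, the following Python fragment splits a keyword line and checks that at least one of the required flags is present. The names split_keywords and has_item_keyword are hypothetical, and treating the 80-byte limit as a character count assumes plain ASCII text; TbGenSig's own checks may differ.

```python
import re
from typing import List

# The flags of which at least one must be specified.
ITEM_KEYWORDS = {"BOOT", "COM", "EXE", "HIGH", "LOW", "SYS", "WIN"}
MAX_LINE_LENGTH = 80


def split_keywords(line: str) -> List[str]:
    """Split a keyword line on spaces, commas or tabs."""
    if len(line) > MAX_LINE_LENGTH:
        raise ValueError("keyword line longer than 80 bytes")
    return [w.upper() for w in re.split(r"[ ,\t]+", line.strip()) if w]


def has_item_keyword(keywords: List[str]) -> bool:
    """True if at least one of the required flags is present."""
    return any(k.upper() in ITEM_KEYWORDS for k in keywords)
```

For example, split_keywords("exe, com\tate inf") gives EXE, COM, ATE and INF, and has_item_keyword accepts that list because it contains EXE and COM.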
3.1.1. Item keywords
BOOT Signature can be found in bootsector/partition code.
COM Signature can be found in COM programs.
This flag causes the scanner to search for this signature
in executable files that do not have an EXE header or
device header.
Note:
The file contents determine the file type, not the
filename extension!
EXE Signature can be found in EXE programs.
This flag causes the scanner to search for this signature
in the load module of EXE type files. EXE files are files
that have an EXE header.
Note:
The file contents determine the file type, not the
filename extension!
HIGH Signature can be found in HIGH memory (above program).
This flag causes the scanner to search for this signature
in memory above the memory allocated by the scanner.
This keyword is intended for resident viruses that allocate
memory at system boot, or viruses that decrease the size of
the last MCB (Memory Control Block).
Note:
The flag HIGH does not mean that the signature should be
searched for in upper memory.
LOW Signature can be found in LOW memory.
This flag causes the scanner to search for this signature
in memory below the PSP (Program Segment Prefix) of the
scanner and in the UMB (Upper Memory Blocks).
This keyword is intended for viruses that remain resident
in memory, using the normal DOS TSR (Terminate and Stay
Resident) function calls.
SYS Signature can be found in SYS programs.
WIN Signature can be found in Windows programs.
3.1.2. Message keywords
DAM Message prefix: 'damaged by'.
DROP Message prefix: 'dropper of'.
FND Message prefix: 'found the'.
INF Message prefix: 'infected by'. Message suffix: 'virus'.
JOKE Message prefix: 'joke named'.
OVW Message prefix: 'overwritten by'.
PROB Message pre-prefix: 'probably'.
TROJ Message prefix: 'trojanized by'.
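How the prefix, pre-prefix and suffix might combine into a report can be sketched as below. The exact wording and layout that TbScan prints is not given in this manual, so the function compose_message and the way it joins the parts are assumptions made purely for illustration.

```python
from typing import List

# Prefixes and suffixes as listed in the table above.
MESSAGE_PREFIXES = {
    "DAM": "damaged by",
    "DROP": "dropper of",
    "FND": "found the",
    "INF": "infected by",
    "JOKE": "joke named",
    "OVW": "overwritten by",
    "TROJ": "trojanized by",
}
MESSAGE_SUFFIXES = {"INF": "virus"}


def compose_message(name: str, keywords: List[str]) -> str:
    """Assemble a report text from the message keywords of a record."""
    kws = [k.upper() for k in keywords]
    parts = []
    if "PROB" in kws:
        parts.append("probably")  # the pre-prefix goes in front
    # Assumption: the first message keyword found decides the prefix.
    chosen = next((k for k in kws if k in MESSAGE_PREFIXES), None)
    if chosen is not None:
        parts.append(MESSAGE_PREFIXES[chosen])
    parts.append(name)
    if chosen in MESSAGE_SUFFIXES:
        parts.append(MESSAGE_SUFFIXES[chosen])
    return " ".join(parts)


print(compose_message("New", ["exe", "com", "ate", "inf"]))  # infected by New virus
```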
3.1.3. Position keywords
UATE Signature should be found at unresolved entry point.
Purpose:
The signature starts directly at the unresolved entry-point
of the virus-code. With some polymorphic viruses, it may be
possible to create a signature from the degarbling routine,
but it may be too short or it may give false positives with
a global search. An initial branch instruction may form
part of the signature.
COM type files: top of file (IP 0100h).
EXE type files: CS:IP as defined in the EXE-header.
WIN type files: Non-DOS CS:IP of the new EXE-header.
Remarks:
The keyword UATE is not allowed for BOOT, SYS, LOW, HMA or
HIGH type signatures.
ATE Signature should be found AT ENTRY point.
Purpose:
The signature starts directly at the entry-point of the
virus-code. With some polymorphic viruses, it may be
possible to create a signature from the degarbling routine,
but it may be too short or it may give false positives with
a global search.
Therefore the keyword ATE is used to make sure that the
scanner does not scan the entire file for the signature, but
only looks at the entry-point for the signature.
The entry-point of a virus is defined as the first
instruction that is not a JUMP SHORT, JUMP LONG, CALL NEAR
or CALL FAR.
Unresolved entry point:  1  JUMP LONG 3
                         2  ...
                         3  JUMP SHORT 5
                         4  ...
                         5  CALL FAR 7
                         6  ...
                         7  CALL NEAR 9
                         8  ...
Resolved entry point:    9  POP <reg>
The entry-point of the above fragment is line 9, as this is
the first code to be executed which is not a JUMP SHORT,
JUMP LONG, CALL NEAR or CALL FAR.
Remarks:
1) The entry-point can be determined by a code analyzer to
cope with tricks like coding a NOP or DEC just before the
branch instruction. Therefore the results of the scanner
should be tested carefully. In case of trouble use the
TbScan 'extract' option to find out what TbScan considers
to be the entry point of the program.
2) The flag UATE is not allowed for BOOT, SYS, LOW, HMA or
HIGH type signatures.
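The idea of resolving an entry point can be sketched as follows for a COM image, where offset 0 of the file corresponds to IP 0100h. This is a simplified illustration, not TbScan's code analyzer: the function resolve_entry is hypothetical, 'JUMP LONG' is assumed to be the E9 near jump with a 16-bit displacement, and CALL FAR is not followed because its target is an absolute segment:offset pair rather than a relative one. Tricks such as a NOP placed before the branch are not handled.

```python
def resolve_entry(image: bytes, start: int = 0, max_hops: int = 32) -> int:
    """Follow initial branch instructions and return the resolved offset."""
    pos = start
    for _ in range(max_hops):
        if not 0 <= pos < len(image):
            raise ValueError("branch leaves the image")
        opcode = image[pos]
        if opcode == 0xEB:                    # JMP SHORT rel8
            operand = image[pos + 1:pos + 2]
            size = 2
        elif opcode in (0xE9, 0xE8):          # JMP NEAR rel16 / CALL NEAR rel16
            operand = image[pos + 1:pos + 3]
            size = 3
        else:
            return pos                        # first non-branch instruction
        if len(operand) != size - 1:
            raise ValueError("truncated branch instruction")
        # Displacements are signed and relative to the next instruction.
        pos += size + int.from_bytes(operand, "little", signed=True)
    raise ValueError("too many chained branches")


# JMP NEAR -> JMP SHORT -> CALL NEAR -> POP AX, with NOP filler in between.
image = bytes.fromhex("E90200 9090 EB02 9090 E80200 9090 58")
print(resolve_entry(image))  # 14
```

In this made-up image the resolved entry point is offset 14, the POP AX instruction, which is where an ATE signature would be matched.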
XHD Signature can be found at offset 2 of EXE file.
Purpose:
This position keyword is rarely used. It should only be
used to detect the also very rare high-level language
viruses; viruses written in a language like C or Basic.
These viruses normally contain standard setup routines and
library routines which are not suitable to define a
signature. The XHD keyword can be used as a last resort to
detect such viruses.
Remarks:
This flag may only be used for EXE or WIN type signatures.
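The restrictions stated in the remarks of this section can be collected in a small check. This is a sketch under assumptions: the function position_keyword_problems is hypothetical, "not allowed for" is read as "may not appear on the same keyword line as", and XHD is accepted as soon as EXE or WIN is present. TbGenSig may apply these rules differently.

```python
from typing import Iterable, List

# Item types for which the remarks say UATE is not allowed.
UATE_FORBIDDEN = {"BOOT", "SYS", "LOW", "HMA", "HIGH"}
# Item types for which XHD may be used.
XHD_ALLOWED = {"EXE", "WIN"}


def position_keyword_problems(keywords: Iterable[str]) -> List[str]:
    """Return a list of rule violations for the given keyword line."""
    kws = {k.upper() for k in keywords}
    problems = []
    if "UATE" in kws:
        clash = sorted(kws & UATE_FORBIDDEN)
        if clash:
            problems.append("UATE cannot be combined with " + ", ".join(clash))
    if "XHD" in kws and not kws & XHD_ALLOWED:
        problems.append("XHD requires EXE or WIN")
    return problems
```

A keyword line such as "com exe ate inf" produces no problems, while "boot uate" or "com xhd" each produce one.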
3.2. Wildcards
In a virus signature, wildcard characters may be used to recognize
so-called polymorphic (self-modifying/mutating) virus code. Below
is a description of the wildcard notation. All numbers are in
hexadecimal.
3.2.1. Position wildcards
Position wildcards affect the position where the parts of the
signature will be matched.
3.2.1.1. Skip
?n = Skip n bytes and continue.
?@nn = Skip nn bytes and continue.
nn should not exceed 7F.
3.2.1.2. Variable
*n = Skip up to n bytes and continue.
*@nn = Skip up to nn bytes and continue.
nn should not exceed 1F.
3.2.2. Opcode wildcards
The 'opcode' wildcards are designed to detect instruction ranges:
3.2.2.1. Low opcode
nL = One of the values in the range n0-n7.
3.2.2.2. High opcode
nH = One of the values in the range n8-nF.
Intended use of the opcode wildcards:
Suppose a polymorphic virus puts a value in a word register (using
a MOV WREG,VALUE instruction), increments a register (using an
INC WREG instruction), and pops a word register from the stack
(using a POP instruction). The registers are variable, and so is
the value. You could code it like this:
bh4l5h
B8-BF are the opcodes for 'MOV WREG,VALUE', 40-47 are the
opcodes for 'INC WREG', and 58-5F are the opcodes for 'POP REG'.
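To make the wildcard notation concrete, here is a minimal sketch of a tokenizer and matcher for it. It is not the TbScan implementation: the names tokenize and match_at and the token tuples are hypothetical, ?n and *n are assumed to take one hexadecimal digit (two after '@'), and the 7F and 1F limits are not enforced.

```python
from typing import List, Tuple

Token = Tuple[str, int]  # (kind, value); kind is byte, range, skip or upto
HEX_DIGITS = "0123456789abcdef"


def _hex(text: str, width: int) -> int:
    """Parse exactly `width` hex digits or raise ValueError."""
    if len(text) != width or any(c not in HEX_DIGITS for c in text):
        raise ValueError(f"bad hex field: {text!r}")
    return int(text, 16)


def tokenize(signature: str) -> List[Token]:
    s = "".join(signature.split()).lower()  # spaces are ignored
    tokens, i = [], 0
    while i < len(s):
        c = s[i]
        if c in "?*":                        # position wildcards
            kind = "skip" if c == "?" else "upto"
            if s[i + 1:i + 2] == "@":
                tokens.append((kind, _hex(s[i + 2:i + 4], 2)))
                i += 4
            else:
                tokens.append((kind, _hex(s[i + 1:i + 2], 1)))
                i += 2
        elif s[i + 1:i + 2] in ("l", "h"):   # opcode wildcards nL / nH
            low = (_hex(c, 1) << 4) + (8 if s[i + 1] == "h" else 0)
            tokens.append(("range", low))    # matches low .. low+7
            i += 2
        else:                                # literal byte
            tokens.append(("byte", _hex(s[i:i + 2], 2)))
            i += 2
    return tokens


def match_at(tokens: List[Token], data: bytes, pos: int = 0, t: int = 0) -> bool:
    """True if tokens[t:] match data starting exactly at offset pos."""
    while t < len(tokens):
        kind, value = tokens[t]
        if kind == "skip":
            pos += value
        elif kind == "upto":
            # Try every allowed number of skipped bytes, 0 .. value.
            return any(match_at(tokens, data, pos + k, t + 1)
                       for k in range(value + 1))
        else:
            if pos >= len(data):
                return False
            upper = value if kind == "byte" else value + 7
            if not value <= data[pos] <= upper:
                return False
            pos += 1
        t += 1
    return pos <= len(data)


print(match_at(tokenize("bh4l5h"), bytes([0xBB, 0x43, 0x5E])))  # True
print(match_at(tokenize("bh4l5h"), bytes([0xB0, 0x43, 0x5E])))  # False
```

The second call fails because B0 lies in the low half (B0-B7) of the Bx range, while 'bh' only accepts B8-BF.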
3.3. Example
To show the power of the appropriate keywords and wildcards,
here is the signature of the Haifa.Mozkin virus.
This virus is highly polymorphic and encrypted. It uses a small
variable decryptor to decrypt the virus.
There are two problems here: most bytes are encrypted or variable
and not suitable to form part of a signature, and the remainder is
short and would cause dozens of false alarms.
However, using the appropriate keywords and wildcards, it is
possible to define a reliable signature. The signature below is
used by TbScan to detect the Haifa.Mozkin virus.
Haifa.Mozkin
com exe ate inf
bh?2bh?109?2*22e80?24l4h75fl
Let's analyze it.
The first line describes the name of the virus.
The second line tells the scanner to search for this signature in
COM and EXE type files. It also tells the scanner that it should
report the file as infected if the signature can be matched. The
keyword ATE instructs the scanner to match this signature only at
the resolved entry-point of the file. The virus starts of course
with decrypting itself, so it is guaranteed that the scanner will
finally reach this location. The ATE instruction limits the scope
of this signature to just one position in a file, so this will
reduce the chances of false alarms significantly.
The third line is the signature definition. Let's reverse engineer
it:
bh?2 This means: a byte in the range B8-BF followed by two
variable bytes. B8-BF is a 'MOV WREG,VALUE'
instruction. Of the register we know only that it is a
word register; the value is unknown.
bh?109 This means: another 'MOV WREG,VALUE' instruction. The
register is a word register, and of the value we know
that it is in the range 0900 to 09FF.
?2*2 This means: skip two to four bytes. This stuff is
inserted by the virus to make it harder to define a
signature.
2e80?2 This means: the virus performs an arithmetic byte-sized
operation with an immediate value (decrypts one
byte) with a CS: segment override. The exact operation,
the memory location and the value are unknown.
4l This means: a byte in the range 40-47. This is an 'INC
WREG' instruction. The virus increments the counter
to the next byte to be decrypted.
4h This means: a byte in the range 48-4F. This is a 'DEC
WREG' instruction. The virus decrements the iteration
count.
75fl Opcode 75 is a JNZ instruction. If the decremented
register did not reach zero, the virus jumps back and
repeats the operation. How far does it jump? The 'fl'
part tells us: the displacement byte is in the range
F0h to F7h, which as a signed value is -10h to -09h
(16 to 9 bytes backwards).
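The arithmetic behind the 'fl' part can be checked with a few lines of Python. The displacement byte of a short conditional jump is a signed 8-bit value, counted from the end of the two-byte JNZ instruction. The helper name rel8 is hypothetical and only serves this illustration.

```python
def rel8(value: int) -> int:
    """Interpret a byte (0-255) as a signed 8-bit displacement."""
    return value - 0x100 if value >= 0x80 else value


# The 'fl' wildcard covers the displacement bytes F0h through F7h.
for raw in (0xF0, 0xF7):
    print(f"{raw:02X}h -> {rel8(raw)}")
# F0h -> -16
# F7h -> -9
```

So the loop body that the JNZ jumps back over starts 9 to 16 bytes before the end of the JNZ instruction, which fits a decryptor loop of only a handful of instructions.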
Although the signature language of TbGenSig is very powerful, there
are viruses which are so highly polymorphic that they require even
more sophisticated wildcards, keywords or even special detection
algorithms. However, the explanations of these wildcards, keywords
and algorithmic detection definitions are so complicated that they
are not suitable for this manual.